
    Are V1 simple cells optimized for visual occlusions? A comparative study

    Abstract: Simple cells in primary visual cortex were famously found to respond to low-level image components such as edges. Sparse coding and independent component analysis (ICA) emerged as the standard computational models for simple cell coding because they linked their receptive fields to the statistics of visual stimuli. However, a salient feature of image statistics, the occlusion of image components, is not considered by these models. Here we ask whether occlusions have an effect on the predicted shapes of simple cell receptive fields. We use a comparative approach to answer this question and investigate two models for simple cells: a standard linear model and an occlusive model. For both models we simultaneously estimate optimal receptive fields, sparsity and stimulus noise. The two models are identical except for their assumption about how components superimpose. We find that the image encodings and receptive fields predicted by the two models differ significantly. While both models predict many Gabor-like fields, the occlusive model predicts a much sparser encoding and a high percentage of ‘globular’ receptive fields. This relatively new center-surround type of simple cell response has been observed ever since reverse correlation came into use in experimental studies. While high percentages of ‘globular’ fields can be obtained with specific choices of sparsity and overcompleteness in linear sparse coding, no or only low proportions are reported in the vast majority of studies on linear models (including all ICA models). Likewise, for the linear model investigated here with optimal sparsity, only low proportions of ‘globular’ fields are observed. In comparison, the occlusive model robustly infers high proportions and can match the experimentally observed high proportions of ‘globular’ fields well. Our computational study therefore suggests that ‘globular’ fields may be evidence for an optimal encoding of visual occlusions in primary visual cortex.

    Author Summary: The statistics of our visual world are dominated by occlusions. Almost every image processed by our brain consists of mutually occluding objects, animals and plants. Our visual cortex is optimized, through evolution and throughout our lifespan, for such stimuli. Yet the standard computational models of primary visual processing do not consider occlusions. In this study, we ask what effects visual occlusions may have on the predicted response properties of simple cells, which are the first cortical processing units for images. Our results suggest that recently observed differences between experiments and the predictions of standard simple cell models can be attributed to occlusions. The most significant consequence of occlusions is the prediction of many cells sensitive to center-surround stimuli. Experimentally, large numbers of such cells have been observed since newer techniques (reverse correlation) came into use. Without occlusions, they are obtained only for specific settings, and none of the seminal studies (sparse coding, ICA) predicted such fields. In contrast, the new type of response emerges naturally as soon as occlusions are considered. In comparison with recent in vivo experiments, we find that occlusive models are consistent with the high percentages of center-surround simple cells observed in macaque monkeys, ferrets and mice.
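
    The only difference between the two models is the superposition rule: the linear model sums active components, while the occlusive model combines them with a point-wise maximum. The following is a minimal numpy sketch of the two rules, not the authors' implementation; the Gabor parameters and noise level are illustrative assumptions.

```python
import numpy as np

def gabor(size, theta, freq=0.15, sigma=3.0, phase=0.0):
    # Simple Gabor patch used here as a stand-in dictionary element.
    r = np.arange(size) - (size - 1) / 2.0
    x, y = np.meshgrid(r, r)
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return envelope * np.cos(2.0 * np.pi * freq * xr + phase)

size = 16
w1 = gabor(size, theta=0.0)           # first component (e.g. one branch)
w2 = gabor(size, theta=np.pi / 2.0)   # second, overlapping component

linear_patch = w1 + w2                # standard linear superposition (SC/ICA assumption)
occlusive_patch = np.maximum(w1, w2)  # occlusion-like, point-wise maximum superposition
observed = occlusive_patch + 0.05 * np.random.randn(size, size)  # noise added after superposition

# The two assumptions differ exactly where the components overlap.
print(np.abs(linear_patch - occlusive_patch).max())
```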

    ProSper -- A Python Library for Probabilistic Sparse Coding with Non-Standard Priors and Superpositions

    ProSper is a Python library containing probabilistic algorithms to learn dictionaries. Given a set of data points, the implemented algorithms seek to learn the elementary components that have generated the data. The library widens the scope of dictionary learning approaches beyond implementations of standard approaches such as ICA, NMF or standard L1 sparse coding. The implemented algorithms are especially well suited when the data consist of components that combine non-linearly and/or when the data require flexible prior distributions. Furthermore, the implemented algorithms go beyond standard approaches by inferring prior and noise parameters of the data, and they provide rich a posteriori approximations for inference. The library is designed to be extendable and currently includes: Binary Sparse Coding (BSC), Ternary Sparse Coding (TSC), Discrete Sparse Coding (DSC), Maximal Causes Analysis (MCA), Maximum Magnitude Causes Analysis (MMCA), and Gaussian Sparse Coding (GSC, a recent spike-and-slab sparse coding approach). The algorithms are scalable thanks to a combination of variational approximations and parallelization. Implementations of all algorithms allow for parallel execution on multiple CPUs and multiple machines for medium- to large-scale applications. Typical large-scale runs of the algorithms can use hundreds of CPUs to learn hundreds of dictionary elements from data with tens of millions of floating-point numbers, such that models with several hundred thousand parameters can be optimized. The library is designed to have minimal dependencies and to be easy to use. It targets users of dictionary learning algorithms and machine learning researchers.
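
    To illustrate the kind of generative model these algorithms invert, the sketch below samples data from a Binary Sparse Coding-style model (binary latents, linear superposition, Gaussian noise). It is plain numpy and does not use ProSper's actual API; all parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

D, H, N = 64, 20, 1000        # observed dims, dictionary size, number of data points
pi, sigma = 0.1, 0.5          # prior activation probability, observation noise std

W = rng.standard_normal((D, H))                   # dictionary, one column per component
S = (rng.random((N, H)) < pi).astype(float)       # binary latents, s_h ~ Bernoulli(pi)
Y = S @ W.T + sigma * rng.standard_normal((N, D)) # linear superposition plus Gaussian noise

# A dictionary-learning algorithm such as BSC is given only Y and tries to recover
# W, pi and sigma (e.g. with variational EM); MCA/MMCA replace the sum over active
# components by a point-wise maximum.
print(Y.shape, S.mean())   # data matrix shape and empirical sparseness of the latents
```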

    When Does Re-initialization Work?

    Re-initializing a neural network during training has been observed to improve generalization in recent works. Yet it is neither widely adopted in deep learning practice nor often used in state-of-the-art training protocols. This raises the question of when re-initialization works, and whether it should be used together with regularization techniques such as data augmentation, weight decay and learning rate schedules. In this work, we conduct an extensive empirical comparison of standard training with a selection of re-initialization methods to answer this question, training over 15,000 models on a variety of image classification benchmarks. We first establish that such methods are consistently beneficial for generalization in the absence of any other regularization. However, when deployed alongside other carefully tuned regularization techniques, re-initialization methods offer little to no added benefit for generalization, although optimal generalization performance becomes less sensitive to the choice of learning rate and weight decay hyperparameters. To investigate the impact of re-initialization methods on noisy data, we also consider learning under label noise. Surprisingly, in this case, re-initialization significantly improves upon standard training, even in the presence of other carefully tuned regularization techniques.

    Comment: Published in PMLR Volume 187; spotlight presentation at the I Can't Believe It's Not Better Workshop at NeurIPS 202
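
    As a sketch of one simple re-initialization variant (the paper compares several methods; the layer choice and schedule below are assumptions, not the paper's protocol), the output layer of a network can be periodically reset during training:

```python
import torch
import torch.nn as nn

# Small classifier whose output layer will be periodically re-initialized.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(32 * 32 * 3, 256),
    nn.ReLU(),
    nn.Linear(256, 10),          # the "head" that gets reset
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)

def reinitialize_head(net):
    # Reset only the final linear layer; a full implementation would typically
    # also clear the optimizer's momentum buffers for the reset parameters.
    net[-1].reset_parameters()

num_epochs, reinit_every = 60, 20
for epoch in range(num_epochs):
    # ... one training epoch (forward pass, loss, backward, optimizer.step()) ...
    if (epoch + 1) % reinit_every == 0 and (epoch + 1) < num_epochs:
        reinitialize_head(model)  # periodic re-initialization during training
```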

    Large-scale parallelized learning of nonlinear sparse coding models

    The human brain is an unparalleled system: Through millions of years of evolution and during a lifespan of learning, our brains have developed remarkable abilities for dealing with incoming sensory data, extracting structure and useful information, and finally drawing the conclusions that result in the actions we take. Understanding the principles behind this machinery and building artificial systems that mimic at least some of these capabilities is a long-standing goal in both the scientific and the engineering communities. While this goal still seems unreachable, we have seen tremendous progress when it comes to training data-driven algorithms on vast amounts of training data, e.g. to learn an optimal data model and its parameters in order to accomplish some task. Such algorithms are now omnipresent: they are part of recommender systems, they perform speech recognition and they generally build the foundation for many semi-autonomous systems. They are becoming an integral part of many of the technical systems that modern societies rely on for their everyday functioning. Many of these algorithms were originally inspired by biological systems or act as models for sensory data processing in mammalian brains. The response properties of a certain population of neurons in the first stages of the mammalian visual pathway, for example, can be modeled by algorithms such as Sparse Coding (SC), Independent Component Analysis (ICA) or Factor Analysis (FA). These well-established learning algorithms typically assume linear interactions between the variables of the model. Most often these relationships are expressed in the form of a matrix-vector product between a matrix of learned dictionary elements (basis vectors as column vectors) and the latent variables of the model. While on the one hand this linear interaction can sometimes be justified by the physical process for which the machine learning model is proposed, on the other hand it is often chosen simply for its mathematical and practical convenience. From an optimal coding point of view, though, one would generally expect the ideal model to closely reflect the core interactions of the system it is modeling. In vision, for example, one of the dominant processes giving rise to our sensory percepts is occlusion. Occluding objects are omnipresent in visual scenes and it would not be surprising if the mammalian visual system were optimized to process occluding structures in the visual data stream. Yet the established mathematical models of the first stages of the visual processing path (e.g. SC, ICA or FA) all assume linear interactions between the active image components. In this thesis we discuss new models that aim to approximate the effects of occluding components by assuming nonlinear interactions between their activated dictionary elements. We present learning algorithms that infer optimal parameters for these models given data. In the experiments, we validate the algorithms on artificial ground truth data and demonstrate their ability to recover the correct model parameters. We show that the predictions made by these nonlinear models correspond better to the experimental data measured in vivo than the predictions made by the established linear models. Furthermore, we systematically explore and compare a large space of plausible combinations of hyperparameters and preprocessing schemes in order to eliminate any effects of artefacts on the observed results.
Training nonlinear sparse coding models is computationally more demanding than training linear models. In order to perform the numerical experiments described in this thesis, we developed a software framework that facilitates the implementation of massively parallel expectation maximization (EM) based learning algorithms. This infrastructure was used for all experiments described here, as well as by collaborators in projects we will not discuss. Some of the experiments required more than 10^17 floating point operations and were run on a computer cluster using up to 5000 CPU cores in parallel. Our parallel framework enabled these experiments to be performed.

The human brain is an impressive system: after billions of years of evolution and through lifelong learning, it efficiently processes a large stream of incoming sensory impressions and extracts from them the information relevant for action. It continuously learns to recognize structure in the incoming data in order to perform future tasks better. In doing so it is fault-tolerant and, even for very imprecise and ambiguous sensory input, typically operates remarkably reliably, thereby securing our survival. In science and in various engineering disciplines it is a long-held wish to understand the principles and concepts behind this kind of information processing. Hardly any other system is studied in such an interdisciplinary manner as our brain: from the molecular and biochemical processes within individual cells, through the collective behavior of smaller and larger neuronal networks, up to the modular organization of the brain. In computer science, the discipline concerned with the study and construction of information-processing systems, there is likewise the hope of developing artificial systems that inherit at least some of these properties from their biological models. Although we are far from having a comprehensive theory of how information processing in the brain works, there has been some progress in recent years. Data-driven algorithms, that is, methods that are first trained on data before they can be applied, are now widespread and form the core components of, for example, speech recognition systems, Internet search engines, recommender systems and a variety of autonomous systems. The majority of these algorithms are either biologically inspired in one way or another, or can at least serve as coarse models of information processing in parts of biological brains. The response behavior of a population of neurons in the first cortical processing stage for visual data in mammals, for example the simple cells in V1, can be approximately modeled by algorithms such as Sparse Coding (SC), Independent Component Analysis (ICA) or Factor Analysis (FA). These well-known and widely used algorithms assume that for every sensory impression a certain number of so-called dictionary elements are activated. These are combined by linear superposition and can thus explain or reconstruct the observed data. The goal of the learning phase of such an algorithm is to find a set of dictionary elements that is as generally applicable as possible and suited to encode all observed sensory impressions as well as possible. In a sense, the dictionary elements thus span the space of all interpretable observations.
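
As a rough illustration of why EM-based learning parallelizes well (the thesis' own framework ran on MPI clusters and is not reproduced here), the E-step decomposes over data points, so the per-data-point posterior computations can be distributed across workers. The toy scoring function below only considers single-component explanations and is an illustrative assumption, not the thesis' algorithm.

```python
import numpy as np
from multiprocessing import Pool

rng = np.random.default_rng(1)
W = rng.standard_normal((64, 8))          # current dictionary estimate (D x H)
data = rng.standard_normal((10_000, 64))  # observed data points (N x D)

def e_step(y):
    # Toy per-data-point E-step: score every single-component explanation of y.
    # Real EM for (non)linear sparse coding evaluates posteriors over combinations
    # of components; only the independence across data points matters here.
    residuals = y[:, None] - W                  # D x H residual per candidate component
    scores = -0.5 * (residuals**2).sum(axis=0)  # unnormalized log-posterior per component
    return scores - scores.max()

if __name__ == "__main__":
    with Pool(processes=4) as pool:             # the E-step parallelizes over data points
        sufficient_stats = pool.map(e_step, data, chunksize=256)
    print(len(sufficient_stats), sufficient_stats[0].shape)
```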

    Percentages of globular receptive fields predicted by the computational models in comparison to in vivo measurements.

    A Receptive fields predicted if occlusion-like superposition is assumed ( out of receptive fields are shown). B Receptive fields predicted if standard linear superposition is assumed ( out of receptive fields are shown). C Percentages of globular fields predicted by the occlusive model (MCA) and by the linear model (BSC) versus the number of hidden units. The experiments for MCA (blue line) and BSC (green line) on DoG-preprocessed image patches were repeated five times and the error bars extend two empirical standard deviations. Standard sparse coding (yellow line) on DoG-processed data shows the lowest fraction of globular fields. To control for the influence of preprocessing, additional experiments were performed on ZCA-whitened data (dashed blue and dashed green lines). The bold red line (and its error bar) shows the fraction of globular fields computed based on in vivo measurements of macaque monkeys [14]. Dashed red lines show the fractions reported for ferrets [15] and mice [16].

    Illustration of the combination of image components, comparison with computational models of component combinations, and receptive field comparison.

    A Image patch (bottom left) showing an intersection of two branches extracted from a grey-level natural scene image (adapted from the van Hateren natural image database [58] with permission from J. H. van Hateren). Preprocessed version of the image patch (bottom right) obtained by using a center-surround filter to model the preprocessing by the lateral geniculate nucleus. B Left: Two image patches manually generated from the grey-level patch in A. Each patch is dominated by one of the two crossing branches of the original patch. Middle: The preprocessed versions of the two patches (central parts). Right: Combination of the two preprocessed patches using an occlusive combination (top) and a standard linear combination (bottom). C Examples of globular and Gabor-like receptive fields measured in V1 of macaque monkeys (courtesy of D. Ringach), and examples of the two receptive field types predicted by the occlusive encoding model. D Percentages of globular receptive fields predicted by different models for hidden units compared to the percentages of globular fields of in vivo recordings.

    Illustration of different superposition models and globular fields.

    A Selection of typical preprocessed image patches. B Superposition of two Gabor fields as assumed by standard sparse coding with continuous priors (along with additive Gaussian noise after superposition). C Superposition of the same two Gabor fields if the hidden units (prefactors) are binary. D Superposition of the Gabor fields if a point-wise maximum is used as the superposition model.